Neural Network Models for Speech Recognition in Mobile Environment
Abstract
Neural networks (NNs) are well known for capturing the complex distributions present in data. The capability of a model depends on the structure of the network and the nature of the data used for training [1]. In this paper, we explore neural network models for digit recognition in a mobile environment. Given recent advances and applications in mobile communication, there is a need to develop speech interfaces for mobile devices. A sophisticated speech interface cannot be built on the device itself because of its limited computation power, data storage, and power supply (battery). Hence, in a mobile environment the speech interface is realized by distributed computing: the speech signal captured by the mobile device is routed to a server at a remote location, where the actual speech recognition or synthesis takes place. Because the bandwidth available to mobile channels is limited, the captured speech signal is compressed by an appropriate speech coder before it is sent to the remote location [2]. Speech recognition is therefore performed on decoded speech. In this paper, we examine recognition performance at different coding rates. It is known that decoded speech is perceptually similar to the original speech but differs from it at the signal level [2]. In this work, we explore neural network models for speech recognition under different coding rates, with the hypothesis that these models will capture the perceptual and cognitive information in speech.
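The claim that decoded speech differs from the original at the signal level, and that the difference depends on the coding rate, can be illustrated with a small sketch. The abstract does not name the speech coder, so the example below assumes plain mu-law companding at a few bit depths as a simple stand-in, and uses a synthetic two-tone frame instead of real speech. The function names (`mu_law_encode`, `mu_law_decode`, `snr_by_bit_depth`, `synthetic_frame`) are hypothetical and chosen only for this illustration.

```python
import math


def mu_law_encode(x, bits, mu=255.0):
    """Compand a sample in [-1, 1] and quantize it to an integer code.

    Assumption: mu-law stands in for the (unnamed) speech coder.
    """
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    levels = 2 ** bits - 1
    return round((compressed + 1.0) / 2.0 * levels)


def mu_law_decode(code, bits, mu=255.0):
    """Invert the quantization and companding of mu_law_encode."""
    levels = 2 ** bits - 1
    compressed = code / levels * 2.0 - 1.0
    magnitude = math.expm1(abs(compressed) * math.log1p(mu)) / mu
    return math.copysign(magnitude, compressed)


def snr_db(original, decoded):
    """Signal-to-noise ratio in dB between a frame and its decoded copy."""
    signal = sum(s * s for s in original)
    noise = sum((s - d) ** 2 for s, d in zip(original, decoded))
    return float("inf") if noise == 0 else 10.0 * math.log10(signal / noise)


def synthetic_frame(n=800, rate=8000):
    """Two sinusoids as a crude placeholder for a speech frame."""
    return [
        0.6 * math.sin(2 * math.pi * 200 * t / rate)
        + 0.3 * math.sin(2 * math.pi * 1100 * t / rate)
        for t in range(n)
    ]


def snr_by_bit_depth(frame, bit_depths=(4, 6, 8)):
    """Encode and decode the frame at each bit depth and report the SNR."""
    results = {}
    for bits in bit_depths:
        decoded = [mu_law_decode(mu_law_encode(s, bits), bits) for s in frame]
        results[bits] = snr_db(frame, decoded)
    return results


if __name__ == "__main__":
    for bits, snr in snr_by_bit_depth(synthetic_frame()).items():
        print(f"{bits} bits/sample: SNR = {snr:.1f} dB")
```

Lower bit depths leave a larger signal-level error in the decoded frame, which is the kind of mismatch a recognizer operating on decoded speech has to tolerate. A real study would replace both the coder and the synthetic frame with the actual speech coder and recorded digits.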
Similar resources
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy is continuously improving, there is a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Speech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
Navigation of a Mobile Robot Using Virtual Potential Field and Artificial Neural Network
Mobile robot navigation is one of the basic problems in robotics. In this paper, a new approach is proposed for autonomous mobile robot navigation in an unknown environment. The proposed approach is based on learning virtual parallel paths that propel the mobile robot toward the track using a multi-layer, feed-forward neural network. For training, a human operator navigates the mobile robot in ...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Publication date: 2009